Journal: Science Advances
Article Title: Emotion schemas are embedded in the human visual system
doi: 10.1126/sciadv.aaw4358
Figure Legend Snippet: (A) Model architecture follows that of AlexNet (five convolutional layers followed by three fully connected layers); only the last fully connected layer has been retrained to predict emotion categories. (B) Activation of artificial neurons in three convolutional layers (1, 3, and 5) and two fully connected layers (6 and 8) of the network. Scatterplots depict t-distributed stochastic neighbor embedding (t-SNE) plots of activation for a random selection of 1000 units in each layer. The first four layers come from a model developed to perform object recognition, and the last layer was retrained to predict emotion categories from an extensive database of video clips. (C) Examples of randomly selected images assigned to each class in holdout test data (images from videos that were not used for training the model). Pictures were not chosen to match target classes. Some examples show contextually driven prediction, e.g., an image of a sporting event is classified as empathic pain, although no physical injury is apparent. (D) Linear classification of activation in each layer of EmoNet shows increasing emotion-related information in later layers, particularly in the retrained layer fc8. Error bars indicate SEM based on the binomial distribution. (E) t-SNE plot shows model predictions in test data. Colors indicate the predicted class, and circled points indicate that the ground truth label was in the top 5 predicted categories. Although t-SNE does not preserve global distances, the plot does convey local clustering of emotions such as amusement and adoration. (F) Normalized confusion matrix shows the proportion of test data that are classified into the 20 categories. Rows correspond to the correct category of test data, and columns correspond to predicted categories. Gray colormap indicates the proportion of predictions in the test dataset, where each row sums to a value of 1.
Correct predictions fall on the diagonal of the matrix, whereas erroneous predictions comprise off-diagonal elements. Categories the model is biased toward predicting, such as amusement, are indicated by dark columns. Data-driven clustering of errors shows 11 groupings of emotions that are all distinguishable from one another (see Materials and Methods and fig. S3). Images were captured from videos in the database developed by Cowen and Keltner.
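The quantities named in panels D to F (binomial SEM, top-5 agreement, and a row-normalized confusion matrix) can be illustrated with a minimal Python sketch. This is not the authors' code, which was written for MATLAB; the function names and the 3-class toy counts below are hypothetical and chosen only for illustration.

```python
import math


def row_normalize(counts):
    """Convert a confusion matrix of raw counts into row proportions.

    Rows are true classes and columns are predicted classes, so each
    non-empty row of the result sums to 1.
    """
    normalized = []
    for row in counts:
        total = sum(row)
        # A class with no test items is left as a row of zeros.
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


def binomial_sem(p, n):
    """Standard error of a proportion p estimated from n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)


def top_k_hit(scores, true_index, k=5):
    """Return True if the true class is among the k highest-scoring classes."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return true_index in ranked[:k]


# Hypothetical toy data: 3 classes instead of the 20 in the figure.
counts = [[8, 1, 1],
          [2, 6, 2],
          [0, 5, 5]]
proportions = row_normalize(counts)

# Overall accuracy is the diagonal mass; its SEM uses the binomial formula.
correct = sum(counts[i][i] for i in range(len(counts)))
n_items = sum(sum(row) for row in counts)
accuracy = correct / n_items
sem = binomial_sem(accuracy, n_items)
```

In the toy matrix, the middle column receives many off-diagonal counts, which is the pattern the legend describes as a dark column for a class the model is biased toward predicting.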
Article Snippet: The pretrained CNN model AlexNet was downloaded for use in MATLAB.
Techniques: Activation Assay, Selection